PAC Reinforcement Learning Algorithm for General-Sum Markov Games

نویسندگان

چکیده

This article presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms Markov games. Using the idea of delayed Q-learning, this extends well-known Nash Q-learning algorithm to build new PAC MARL general-sum In addition guiding design provably algorithm, enables checking whether an arbitrary is PAC. Comparative numerical results demonstrate algorithm's performance and robustness.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multiagent reinforcement learning: algorithm converging to Nash equilibrium in general-sum discounted stochastic games

Reinforcement learning turned out a technique that allowed robots to ride a bicycle, computers to play backgammon on the level of human world masters and solve such complicated tasks of high dimensionality as elevator dispatching. Can it come to rescue in the next generation of challenging problems like playing football or bidding on virtual markets? Reinforcement learning that provides a way o...

متن کامل

Taking turns in general sum Markov games

This paper provides a novel approach to multi-agent coordination in general sum Markov games. Contrary to what is common in multi-agent learning, our approach does not focus on reaching a particular equilibrium between agent policies. Instead, it learns a basis set of special joint agent policies, over which it can randomize to build different solutions. The main idea is to tackle a Markov game...

متن کامل

QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games

Markov games are a framework which formalises n-agent reinforcement learning. For instance, Littman proposed the minimax-Q algorithm to model two-agent zero-sum problems. This paper proposes a new simple algorithm in this framework, QL2, and compares it to several standard algorithms (Q-learning, Minimax and minimax-Q). Experiments show that QL2 converges to optimal mixed policies, as minimax-Q...

متن کامل

Hierarchical Multiagent Reinforcement Learning in Markov Games

Interactions between intelligent agents in multiagent systems can be modeled and analyzed by using game theory. The agents select actions that maximize their utility function so that they also take into account the behavior of the other agents in the system. Each agent should therefore utilize some model of the other agents. In this paper, the focus is on the situation which has a temporal stru...

متن کامل

Value-function reinforcement learning in Markov games

Markov games are a model of multiagent environments that are convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. The main contribution of this paper is that it presents the convergence theorems in a way that makes it easy to reason ab...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: IEEE Transactions on Automatic Control

سال: 2023

ISSN: ['0018-9286', '1558-2523', '2334-3303']

DOI: https://doi.org/10.1109/tac.2022.3219340